The OpenVINO™ toolkit 2024.3 release enhances generative AI (GenAI) accessibility with improved large language model (LLM) performance and expanded model coverage. It also boosts portability and performance for deployment anywhere: at the edge, in the cloud, or locally. The top features of this release are:
Easier Model Access and Conversion
| Product | Details |
|---|---|
| New Model Support | Support for Phi-3-mini, a family of AI models that takes advantage of the power of small language models for faster, more accurate, and cost-effective text processing. Llama 3 optimizations for CPUs, built-in GPUs, and discrete GPUs for improved performance and efficient memory usage. |
| Python* | A Python custom operation is now enabled in the OpenVINO toolkit, making it easier for Python developers to code their custom operations instead of using C++ custom operations (also supported). This custom operation lets you add your own specialized operations to any model. |
Generative AI and LLM Enhancements
Expanded model support and accelerated inference.
| Product | Details |
|---|---|
| New Jupyter Notebooks | An expansion to Jupyter Notebooks ensures better coverage for new models. The following noteworthy notebooks were added: |
| Performance Improvements for LLMs | A GPTQ method for 4-bit weight compression was added to the Neural Networks Compression Framework (NNCF) for more efficient inference and improved performance of compressed LLMs. There are significant LLM performance improvements and reduced latency for built-in and discrete GPUs. |
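To give a rough sense of what 4-bit weight compression means in practice, the sketch below quantizes a small list of weights to signed 4-bit codes with one shared scale per group. It is an assumption-laden illustration in plain Python: it uses simple round-to-nearest rounding, not GPTQ, and it does not call the NNCF API. The function names, the group size of 4, and the 16-bit scale are all hypothetical choices made for the example.

```python
# Illustrative sketch only: plain round-to-nearest 4-bit quantization.
# This is NOT the GPTQ algorithm and NOT the NNCF API; all names are hypothetical.

def quantize_group(weights):
    """Map one group of floats to signed 4-bit codes in [-7, 7] with a shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return codes, scale

def quantize(weights, group_size):
    """Split the weights into groups and quantize each group independently."""
    return [
        quantize_group(weights[i:i + group_size])
        for i in range(0, len(weights), group_size)
    ]

def dequantize(groups):
    """Reconstruct approximate float weights from (codes, scale) pairs."""
    restored = []
    for codes, scale in groups:
        restored.extend(code * scale for code in codes)
    return restored

def compressed_bits(num_weights, group_size, scale_bits=16):
    """Storage cost: 4 bits per weight plus one scale per group (assumed 16-bit)."""
    num_groups = -(-num_weights // group_size)  # ceiling division
    return num_weights * 4 + num_groups * scale_bits

weights = [0.12, -0.50, 0.33, 0.07, 1.40, -0.90, 0.25, -0.10]
groups = quantize(weights, group_size=4)
restored = dequantize(groups)

# Worst-case reconstruction error and size relative to 16-bit storage.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
ratio = (len(weights) * 16) / compressed_bits(len(weights), group_size=4)
print(f"max error: {max_error:.4f}, compression vs 16-bit: {ratio:.2f}x")
```

The per-group scale is what keeps the error bounded: each weight is off by at most half a quantization step of its own group. Larger groups reduce the scale overhead (for example, one 16-bit scale per 128 weights adds 0.125 bits per weight) at the cost of coarser steps. Methods such as GPTQ aim to choose the 4-bit codes more carefully than the round-to-nearest rule used here.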
The OpenVINO™ toolkit is an open source toolkit that accelerates AI inference with lower latency and higher throughput while maintaining accuracy, reducing model footprint, and optimizing hardware use. It streamlines AI development and integration of deep learning in domains like computer vision, large language models (LLMs), and generative AI.
What's New in 2024.3
Learn with like-minded AI developers by joining live and on-demand webinars focused on GenAI, LLMs, AI PC, and more, including code-based workshops using Jupyter* Notebook.
Convert and optimize models trained using popular frameworks like TensorFlow* and PyTorch*. Deploy across a mix of Intel® hardware and environments: on-premises, on-device, in the browser, or in the cloud.
Get started with OpenVINO and all the resources you need to learn, try samples, see performance, and more.
Review optimization and deployment strategies using the OpenVINO toolkit. Plus, use compression techniques with LLMs on your PC.
This is a commercial software platform that enables enterprise teams to develop vision AI models faster. With the platform, companies can build models with minimal data, and with OpenVINO integration, facilitate deploying solutions at scale.
Explore the Capabilities of the Intel® Geti™ Platform
When you are ready to go to market with your solution, explore ISV solutions that are built on OpenVINO. This ebook is designed to help you find a solution that best addresses your use-case needs. It is organized into sections, such as banking or healthcare, to help you navigate the solutions table more easily.
Explore the AI Inference Catalog
Take advantage of add-ons that extend the possibilities of the toolkit, and implement existing and new functionality now available in the core toolkit.
Estimate deep learning inference performance on supported devices.
Use this add-on to build, transform, and analyze datasets.
This cross-platform, command-line tool facilitates the transition between training and deployment environments, performs static model analysis, and adjusts deep learning models for optimal performance on end-point target devices.
Use this framework based on PyTorch for quantization-aware training.
Hugging Face* has a repository for the OpenVINO toolkit that provides resources and models aimed at optimizing deep learning models for inference on Intel hardware.
This scalable inference server is for serving models optimized with the Intel® Distribution of OpenVINO™ toolkit.
Explore ways to get involved and stay up-to-date with the latest announcements.
Optimize, fine-tune, and run comprehensive AI inference using the included model optimizer, runtime, and development tools.
The productive smart path to freedom from the economic and technical burdens of proprietary alternatives for accelerated computing.